skip to main content


Search for: All records

Creators/Authors contains: "De Choudhury, Munmun"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Applied machine learning (ML) has not yet coalesced on standard practices for research ethics. For ML that predicts mental illness using social media data, ambiguous ethical standards can impact peoples’ lives because of the area’s sensitivity and material con- sequences on health. Transparency of current ethics practices in research is important to document decision-making and improve research practice. We present a systematic literature review of 129 studies that predict mental illness using social media data and ML, and the ethics disclosures they make in research publications. Rates of disclosure are going up over time, but this trend is slow moving – it will take another eight years for the average paper to have coverage on 75% of studied ethics categories. Certain practices are more readily adopted, or "stickier", over time, though we found pri- oritization of data-driven disclosures rather than human-centered. These inconsistently reported ethical considerations indicate a gap between what ML ethicists believe ought to be and what actually is done. We advocate for closing this gap through increased trans- parency of practice and formal mechanisms to support disclosure. 
    more » « less
    Free, publicly-accessible full text available June 12, 2024
  2. Explainable AI (XAI) systems are sociotechnical in nature; thus, they are subject to the sociotechnical gap-divide between the technical affordances and the social needs. However, charting this gap is challenging. In the context of XAI, we argue that charting the gap improves our problem understanding, which can reflexively provide actionable insights to improve explainability. Utilizing two case studies in distinct domains, we empirically derive a framework that facilitates systematic charting of the sociotechnical gap by connecting AI guidelines in the context of XAI and elucidating how to use them to address the gap. We apply the framework to a third case in a new domain, showcasing its affordances. Finally, we discuss conceptual implications of the framework, share practical considerations in its operationalization, and offer guidance on transferring it to new contexts. By making conceptual and practical contributions to understanding the sociotechnical gap in XAI, the framework expands the XAI design space. 
    more » « less
  3. We collected Instagram data from 150 adolescents (ages 13-21) that included 15,547 private message conversations of which 326 conversations were flagged as sexually risky by participants. Based on this data, we leveraged a human-centered machine learning approach to create sexual risk detection classifiers for youth social media conversations. Our Convolutional Neural Network (CNN) and Random Forest models outperformed in identifying sexual risks at the conversation-level (AUC=0.88), and CNN outperformed at the message-level (AUC=0.85). We also trained classifiers to detect the severity risk level (i.e., safe, low, medium-high) of a given message with CNN outperforming other models (AUC=0.88). A feature analysis yielded deeper insights into patterns found within sexually safe versus unsafe conversations. We found that contextual features (e.g., age, gender, and relationship type) and Linguistic Inquiry and Word Count (LIWC) contributed the most for accurately detecting sexual conversations that made youth feel uncomfortable or unsafe. Our analysis provides insights into the important factors and contextual features that enhance automated detection of sexual risks within youths' private conversations. As such, we make valuable contributions to the computational risk detection and adolescent online safety literature through our human-centered approach of collecting and ground truth coding private social media conversations of youth for the purpose of risk classification. 
    more » « less
  4. Instagram, one of the most popular social media platforms among youth, has recently come under scrutiny for potentially being harmful to the safety and well-being of our younger generations. Automated approaches for risk detection may be one way to help mitigate some of these risks if such algorithms are both accurate and contextual to the types of online harms youth face on social media platforms. However, the imminent switch by Instagram to end-to-end encryption for private conversations will limit the type of data that will be available to the platform to detect and mitigate such risks. In this paper, we investigate which indicators are most helpful in automatically detecting risk in Instagram private conversations, with an eye on high-level metadata, which will still be available in the scenario of end-to-end encryption. Toward this end, we collected Instagram data from 172 youth (ages 13-21) and asked them to identify private message conversations that made them feel uncomfortable or unsafe. Our participants risk-flagged 28,725 conversations that contained 4,181,970 direct messages, including textual posts and images. Based on this rich and multimodal dataset, we tested multiple feature sets (metadata, linguistic cues, and image features) and trained classifiers to detect risky conversations. Overall, we found that the metadata features (e.g., conversation length, a proxy for participant engagement) were the best predictors of risky conversations. However, for distinguishing between risk types, the different linguistic and media cues were the best predictors. Based on our findings, we provide design implications for AI risk detection systems in the presence of end-to-end encryption. More broadly, our work contributes to the literature on adolescent online safety by moving toward more robust solutions for risk detection that directly takes into account the lived risk experiences of youth. 
    more » « less
  5. Abstract Misinformation about the COVID-19 pandemic proliferated widely on social media platforms during the course of the health crisis. Experts have speculated that consuming misinformation online can potentially worsen the mental health of individuals, by causing heightened anxiety, stress, and even suicidal ideation. The present study aims to quantify the causal relationship between sharing misinformation, a strong indicator of consuming misinformation, and experiencing exacerbated anxiety. We conduct a large-scale observational study spanning over 80 million Twitter posts made by 76,985 Twitter users during an 18.5 month period. The results from this study demonstrate that users who shared COVID-19 misinformation experienced approximately two times additional increase in anxiety when compared to similar users who did not share misinformation. Socio-demographic analysis reveals that women, racial minorities, and individuals with lower levels of education in the United States experienced a disproportionately higher increase in anxiety when compared to the other users. These findings shed light on the mental health costs of consuming online misinformation. The work bears practical implications for social media platforms in curbing the adverse psychological impacts of misinformation, while also upholding the ethos of an online public sphere. 
    more » « less
  6. Online sexual risks pose a serious and frequent threat to adolescents’ online safety. While significant work is done within the HCI community to understand teens’ sexual experiences through public posts, we extend their research by qualitatively analyzing 156 private Instagram conversations flagged by 58 adolescents to understand the characteristics of sexual risks faced with strangers, acquaintances, and friends. We found that youth are often victimized by strangers through sexual solicitation/harassment as well as sexual spamming via text and visual media, which is often ignored by them. In contrast, adolescents’ played mixed roles with acquaintances, as they were often victims of sexual harassment, but sometimes engaged in sexting, or interacted by rejecting sexual requests from acquaintances. Lastly, adolescents were never recipients of sexual risks with their friends, as they mostly mutually participated in sexting or sexual spamming. Based on these results, we provide our insights and recommendations for future researchers. Trigger Warning: This paper contains explicit language and anonymized private sexual messages. Reader discretion advised. 
    more » « less
  7. Sexual exploration is a natural part of adolescent development; yet, unmediated internet access has enabled teens to engage in a wider variety of potentially riskier sexual interactions than previous generations, from normatively appropriate sexual interactions to sexually abusive situations. Teens have turned to online peer support platforms to disclose and seek support about these experiences. Therefore, we analyzed posts (N=45,955) made by adolescents (ages 13--17) on an online peer support platform to deeply examine their online sexual risk experiences. By applying a mixed methods approach, we 1) accurately (average of AUC = 0.90) identified posts that contained teen disclosures about online sexual risk experiences and classified the posts based on level of consent (i.e., consensual, non-consensual, sexual abuse) and relationship type (i.e., stranger, dating/friend, family) between the teen and the person in which they shared the sexual experience, 2) detected statistically significant differences in the proportions of posts based on these dimensions, and 3) further unpacked the nuance in how these online sexual risk experiences were typically characterized in the posts. Teens were significantly more likely to engage in consensual sexting with friends/dating partners; unwanted solicitations were more likely from strangers and sexual abuse was more likely when a family member was involved. We contribute to the HCI and CSCW literature around youth online sexual risk experiences by moving beyond the false dichotomy of "safe" versus "risky". Our work provides a deeper understanding of technology-mediated adolescent sexual behaviors from the perspectives of sexual well-being, risk detection, and the prevention of online sexual violence toward youth. 
    more » « less
  8. Current youth online safety and risk detection solutions are mostly geared toward parental control. As HCI researchers, we acknowledge the importance of leveraging a youth-centered approach when building Artificial Intelligence (AI) tools for adolescents online safety. Therefore, we built the MOSafely, Is that ‘Sus’ (youth slang for suspicious)? a web-based risk detection assessment dashboard for youth (ages 13-21) to assess the AI risks identified within their online interactions (Instagram and Twitter Private conversations). This demonstration will showcase our novel system that embedded risk detection algorithms for youth evaluations and adopted the human–in–the loop approach for using youth evaluations to enhance the quality of machine learning models. 
    more » « less
  9. We collected Instagram Direct Messages (DMs) from 100 adolescents and young adults (ages 13-21) who then flagged their own conversations as safe or unsafe. We performed a mixed-method analysis of the media files shared privately in these conversations to gain human-centered insights into the risky interactions experienced by youth. Unsafe conversations ranged from unwanted sexual solicitations to mental health related concerns, and images shared in unsafe conversations tended to be of people and convey negative emotions, while those shared in regular conversations more often conveyed positive emotions and contained objects. Further, unsafe conversations were significantly shorter, suggesting that youth disengaged when they felt unsafe. Our work uncovers salient characteristics of safe and unsafe media shared in private conversations and provides the foundation to develop automated systems for online risk detection and mitigation. 
    more » « less
  10. In this work, we present a case study on an Instagram Data Donation (IGDD) project, which is a user study and web-based platform for youth (ages 13-21) to donate and annotate their Instagram data with the goal of improving adolescent online safety. We employed human-centered design principles to create an ecologically valid dataset that will be utilized to provide insights from teens’ private social media interactions and train machine learning models to detect online risks. Our work provides practical insights and implications for Human-Computer Interaction (HCI) researchers that collect and study social media data to address sensitive problems relating to societal good. 
    more » « less